Preparations

Load the necessary libraries

library(car)       #for regression diagnostics
library(broom)     #for tidy output
library(ggfortify) #for model diagnostics
library(sjPlot)    #for outputs
library(knitr)     #for kable
library(effects)   #for partial effects plots
library(emmeans)   #for estimating marginal means
library(ggeffects) #for partial effects plots
library(modelr)    #for auxiliary modelling functions
library(DHARMa)    #for residual diagnostics plots
library(performance) #for residuals diagnostics
library(see)         #for plotting residuals
library(tidyverse) #for data wrangling

Scenario

Polis et al. (1998) were interested in modelling the presence/absence of lizards (Uta sp.) against the perimeter to area ratio of 19 islands in the Gulf of California.

Uta lizard

Format of polis.csv data file

ISLAND      RATIO  PA
Bota        15.41  1
Cabeza       5.63  1
Cerraja     25.92  1
Coronadito  15.17  0
..          ..     ..

ISLAND: Categorical listing of the name of the 19 islands used - variable not used in analysis.
RATIO: Ratio of perimeter to area of the island.
PA: Presence (1) or absence (0) of Uta lizards on island.

The aim of the analysis is to investigate the relationship between island perimeter to area ratio and the presence/absence of Uta lizards.

Read in the data

polis = read_csv('../public/data/polis.csv', trim_ws=TRUE)
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   ISLAND = col_character(),
##   RATIO = col_double(),
##   PA = col_double()
## )
glimpse(polis)
## Rows: 19
## Columns: 3
## $ ISLAND <chr> "Bota", "Cabeza", "Cerraja", "Coronadito", "Flecha", "Gemelose…
## $ RATIO  <dbl> 15.41, 5.63, 25.92, 15.17, 13.04, 18.85, 30.95, 22.87, 12.01, …
## $ PA     <dbl> 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1
head(polis)
str(polis)
## tibble [19 × 3] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ ISLAND: chr [1:19] "Bota" "Cabeza" "Cerraja" "Coronadito" ...
##  $ RATIO : num [1:19] 15.41 5.63 25.92 15.17 13.04 ...
##  $ PA    : num [1:19] 1 1 1 0 1 0 0 0 0 1 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   ISLAND = col_character(),
##   ..   RATIO = col_double(),
##   ..   PA = col_double()
##   .. )

Exploratory data analysis

Model formula: \[ y_i \sim \text{Bin}(n, p_i)\\ \ln\left(\frac{p_i}{1-p_i}\right) = \beta_0 + \beta_1 x_i \]

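where \(y_i\) represents the \(i^{th}\) observed value (1 for presence, 0 for absence), \(n\) represents the number of trials (in the case of logistic regression, this is always 1), \(p_i\) represents the probability of lizards being present in the \(i^{th}\) population, \(x_i\) represents the perimeter to area ratio of the \(i^{th}\) island, and \(\beta_0\) and \(\beta_1\) represent the intercept and slope respectively, both on the logit (log-odds) scale.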
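
To make the link function concrete, the following is a minimal sketch of how a linear predictor on the logit scale maps back to a probability. The values of beta0 and beta1 and the ratio vector are hypothetical, chosen only for illustration; they are not estimates from polis.csv. It uses only the base R functions plogis() (inverse logit) and qlogis() (logit).

```r
# Hypothetical coefficients, for illustration only (NOT estimates from polis.csv)
beta0 <- 3
beta1 <- -0.2

# Hypothetical perimeter to area ratios at which to evaluate the model
ratio <- c(5, 15, 25)

# Linear predictor on the logit (log-odds) scale
eta <- beta0 + beta1 * ratio

# Inverse logit: p = exp(eta) / (1 + exp(eta))
p <- plogis(eta)
p

# Applying the logit link to p recovers the linear predictor
all.equal(qlogis(p), eta)
```

With a negative slope, the probability of presence declines as the ratio increases, and it equals 0.5 where the linear predictor is zero (here, at a ratio of 15).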

ggplot(polis, aes(y=PA, x=RATIO))+
  geom_point()

ggplot(polis, aes(y=PA, x=RATIO))+
  geom_point()+
  geom_smooth(method='glm', formula=y~x,
              method.args=list(family='binomial'))

Fit the model
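
One way the model formula could be fitted is with glm() and a binomial family. The sketch below is self-contained, so it uses only the first nine islands whose RATIO and PA values are visible in the glimpse() output; the object names polis_sub and polis_sub.glm are assumptions made for this illustration. Because it is a subset, the estimates will differ from those obtained from the full 19-island data set.

```r
# First nine islands only, copied from the glimpse() output.
# Assumption: an illustrative subset, not the full polis data.
polis_sub <- data.frame(
  RATIO = c(15.41, 5.63, 25.92, 15.17, 13.04, 18.85, 30.95, 22.87, 12.01),
  PA    = c(1, 1, 1, 0, 1, 0, 0, 0, 0)
)

# Logistic regression: binomial family with a logit link
polis_sub.glm <- glm(PA ~ RATIO,
                     family = binomial(link = "logit"),
                     data = polis_sub)

# Intercept and slope on the logit (log-odds) scale
coef(polis_sub.glm)
```

To fit the full data, the same call would presumably be applied with data = polis after reading in polis.csv.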

Model validation

Partial plots

Model investigation / hypothesis testing

Predictions

Summary figures

References

Polis, G. A., S. D. Hurd, C. D. Jackson, and F. Sanchez-Piñero. 1998. “Multifactor Population Limitation: Variable Spatial and Temporal Control of Spiders on Gulf of California Islands.” Ecology 79: 490–502.